The Multimodal AI Healthcare Chatbot with Symptom Analysis, Image-Based Detection and Severity Classification aims at providing intelligent assistance in healthcare through Multimodal Large Language Model (MLLM) based reasoning of AI. In the proposed solution, the user inputs the symptoms and also uploads images which are processed by the algorithm for detection of possible health issues and categorizing them into three classes: Mild, Moderate and Severe. Based upon the level of severity, the algorithm makes suitable medical recommendations for health decisions. While LLM-based healthcare bots depend only on text-based processing of symptoms for their operations, the Multimodal Healthcare AI is characterized by symptom analysis, image-based detection and severity classification. Experimental results prove that the proposed approach achieves an accuracy of 92%, whereas the LLM based bot performs with 85% accuracy, giving an improvement of 7%.Top of Form Bottom of Form
Introduction
Traditional healthcare chatbots mostly focus only on text-based symptom checking, which creates gaps such as lack of image analysis, severity detection, and doctor recommendations. To overcome this, the proposed system integrates multiple AI capabilities including symptom analysis, medical image processing, severity classification, and personalized doctor suggestions.
The system uses multimodal AI (LLama 3 Vision via Groq API) to process both text symptoms and medical images. It is supported by datasets from Kaggle for symptom data, disease prediction, and skin disease images. The chatbot also uses a Gradio interface for user interaction and includes text-to-speech (TTS) features to provide spoken medical guidance.
After analyzing user input, the system classifies conditions into mild, moderate, or severe, and recommends appropriate medical actions or doctors. A rule-based logic layer further refines severity detection and recommendations.
Evaluation results show that the multimodal system (MLLM) outperforms traditional LLM-based chatbots. It achieves higher performance across all metrics, including 92% accuracy compared to 85%, along with improved precision, recall, sensitivity, and specificity. It also shows better classification of mild, moderate, and severe cases.
Conclusion
The Multimodal AI Healthcare Chatbot is designed to offer intelligent real-time healthcare support using Multimodal LLM-based reasoning, symptom detection, image detection, severity classification, doctor recommendation, and Text-to-Speech (TTS). It uses the inputs of the symptoms and medical images to diagnose the health issues and recommend the right medical advice, whereas the severity classification helps make informed decisions and prioritize severe cases.
According to the evaluation results, the Multimodal AI Healthcare Chatbot significantly outperforms the current LLM-based model. The system achieves an accuracy rate of 92%, a precision of 90%, a recall of 91%, a sensitivity of 93%, and a specificity of 89%. Hence, there is an average 7% improvement across all measures.
References
[1] Ayers, J. W., Poliak, A., Dredze, M., et al. (2023). Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. JAMA Internal Medicine
[2] Sallam, M. (2023). ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review. Healthcare (MDPI), 11(6), 887
[3] Cascella, M., Montomoli, J., Bellini, V., et al. (2023). Evaluating the Feasibility of ChatGPT in Healthcare: Systematic Review. Healthcare (MDPI), 11(4), 547
[4] Kung, T. H., Cheatham, M., Medenilla, A., et al. (2023). Performance of ChatGPT on USMLE: Potential for AI-assisted Medical Education. PLOS Digital Health
[5] Thirunavukarasu, A. J., Hassan, R,Mahmood, S., et al. (2023). Trialling ChatGPT in Healthcare: Review of Applications. BMJ Health & Care Informatics
[6] Lee, P., Bubeck, S., Petro, J. (2023). Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine. New England Journal of Medicine
[7] Nadarzynski, T., Miles, O., Cowie, A., et al. (2022). Acceptability of Artificial Intelligence-Led Chatbots in Healthcare. Digital Health Journal
[8] Kocaballi, A. B., Quiroz, J. C., Rezazadegan,D.,etal. (2022). Conversational Agents for Healthcare: Systematic Review. npj Digital Medicine WHO (2023). Ethics and Governance
[9] Gilson, A., Safranek, C. W., Huang, T., et al. (2023)How Does ChatGPT Perform on the United States Medical Licensing Examination?PLOS Digital Health
[10] Singhal, K., Azizi, S., Tu, T., et al. (2023). Large Language Models Encode Clinical Knowledge. Nature.
[11] Patel, S. B., & Lam, K. (2023). ChatGPT: The Future of Medical Education and Clinical Practice? The Lancet Digital Health.
[12] Rao, A., Kim, J., Kamineni, M., et al. (2023). Evaluating ChatGPT as an Adjunct for Radiologic Decision-Making. Radiology.
[13] Meskó, B., & Topol, E. (2023). The Imperative for Artificial Intelligence in Healthcare. npj Digital Medicine.
[14] Anjum, Hiba, B. Sasidhar, and T. Rajesh. \"Lung Nodule Segmentation and Classification using Image Processing and Deep Learning Techniques.\" Grenze International Journal of Engineering & Technology (GIJET) 11 (2025).
[15] Medical Chatbot Dataset, Kagglehttps://www.kaggle.com/datasets/saifulislamsarfaraz/medical-chatbot-dataset
[16] Symptom-Based Disease Prediction Dataset, Kaggle https://www.kaggle.com/datasets/miltonmacgyver/symptom-based-disease-prediction-dataset
[17] Skin Disease Image Dataset, Kaggle https://www.kaggle.com/datasets/ismailpromus/skin-diseases-image-dataset